Improvements of Hidden Chunk Models

Authors

  • Harald Höge
  • Panji Setiawan
Abstract

The statistical properties of segments [8] are investigated using a specific acoustic model called the hidden chunk model (HCM). We call the sequence of feature vectors assigned to a segment a chunk of length ℓ. The HCM still assumes that the feature vectors are statistically independent. In contrast to the hidden Markov model (HMM), we introduce emission probabilities that depend on ℓ. Segment error rates (SERs) are calculated on a database with over 33 million chunks aligned to 607 segments. The HCM achieves more than 10% absolute improvement in SER compared to the HMM. Based on the estimated Shannon entropy, the proposed HCM paves the way to acoustic models that approach the lowest possible SER.
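
The idea of length-dependent emission probabilities can be illustrated with a small sketch. The abstract does not state the form of the emission densities, so everything concrete here is an assumption made for illustration: one diagonal-covariance Gaussian per (segment, chunk length) pair, and the hypothetical names `log_gauss`, `chunk_log_likelihood`, `classify`, and `segment_error_rate`. Only the two stated properties are taken from the text: feature vectors within a chunk are treated as independent, and the emission parameters depend on the chunk length ℓ.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def chunk_log_likelihood(chunk, emissions):
    """Log-likelihood of a chunk under one segment.

    `emissions` maps a chunk length to (mean, var). This per-length
    Gaussian is an assumed toy parameterisation, not the paper's model.
    Feature vectors are treated as independent, so log densities add.
    """
    mean, var = emissions[len(chunk)]
    return sum(log_gauss(x, mean, var) for x in chunk)


def classify(chunk, models):
    """Return the segment label whose model scores the chunk highest."""
    return max(models, key=lambda seg: chunk_log_likelihood(chunk, models[seg]))


def segment_error_rate(labelled_chunks, models):
    """Fraction of (chunk, true_label) pairs that are misclassified."""
    errors = sum(classify(c, models) != label for c, label in labelled_chunks)
    return errors / len(labelled_chunks)


# Toy models: the emission mean of each segment changes with chunk length.
models = {
    "a": {1: ([0.0], [1.0]), 2: ([5.0], [1.0])},
    "b": {1: ([5.0], [1.0]), 2: ([0.0], [1.0])},
}
data = [([[0.1]], "a"), ([[0.2], [-0.1]], "b")]
ser = segment_error_rate(data, models)
```

In this toy setup the same feature values near zero are assigned to segment `a` when the chunk has length 1 and to segment `b` when it has length 2, which is the kind of distinction a length-independent emission model cannot make.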

Similar resources

Tagging Complex Non-Verbal German Chunks with Conditional Random Fields

We report on chunk tagging methods for German that recognize complex non-verbal phrases using structural chunk tags with Conditional Random Fields (CRFs). This state-of-the-art method for sequence classification achieves 93.5% accuracy on newspaper text. For the same task, a classical trigram tagger approach based on Hidden Markov Models reaches a baseline of 88.1%. CRFs allow for a clean and p...

Bitext Alignment for Statistical Machine Translation

Bitext alignment is the task of finding translation equivalence between documents in two languages, collections of which are commonly known as bitext. This dissertation addresses the problems of statistical alignment at various granularities from sentence to word with the goal of creating Statistical Machine Translation (SMT) systems. SMT systems are statistical pattern processors based on para...

Phrase-Boundary Translation Model Using Shallow Syntactic Labels

Phrase-boundary model for statistical machine translation labels the rules with classes of boundary words on the target side phrases of training corpus. In this paper, we extend the phrase-boundary model using shallow syntactic labels including POS tags and chunk labels. With the priority of chunk labels, the proposed model names non-terminals with shallow syntactic labels on the boundaries of ...

Error-driven HMM-based Chunk Tagger with Context-dependent Lexicon

This paper proposes a new error-driven HMM-based text chunk tagger with context-dependent lexicon. Compared with a standard HMM-based tagger, this tagger uses a new Hidden Markov Modelling approach which incorporates more contextual information into a lexical entry. Moreover, an error-driven learning approach is adopted to decrease the memory requirement by keeping only positive lexical entries an...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

Publication date: 2010